Speech separation algorithm based on convolutional encoder decoder and gated recurrent unit
CHEN Xiukai, LU Zhihua, ZHOU Yu
Journal of Computer Applications, 2020, 40(7): 2137-2141. DOI: 10.11772/j.issn.1001-9081.2019111968
Abstract
Most deep-learning-based speech separation and speech enhancement algorithms use spectral features obtained by the Fourier transform as the input to the neural network and ignore the phase information in the speech signal. However, previous studies have shown that phase information is essential for improving speech quality, especially at low Signal-to-Noise Ratio (SNR). To address this problem, a speech separation algorithm based on a Convolutional Encoder-Decoder network and a Gated Recurrent Unit network (CED-GRU) was proposed. First, because the original waveform contains both amplitude and phase information, the original waveform of the mixed speech signal was used as the input feature. Second, the Convolutional Encoder-Decoder (CED) network was combined with the Gated Recurrent Unit (GRU) network to handle the temporal structure of the speech signal effectively. Compared with the Permutation Invariant Training (PIT), Deep Clustering (DC), and Deep Attractor Network (DAN) algorithms, the proposed algorithm increased the Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI) for male-male, male-female, and female-female mixtures by 1.16 and 0.29, 1.37 and 0.27, 1.08 and 0.3; 0.87 and 0.21, 1.11 and 0.22, 0.81 and 0.24; and 0.64 and 0.24, 1.01 and 0.34, 0.73 and 0.29 percentage points, respectively. The experimental results show that the CED-GRU-based speech separation system has great value in practical application.
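The abstract's motivation, that magnitude-spectrum features discard phase while the raw waveform retains it, can be illustrated with a small sketch. This is not the authors' code: the toy signal, the naive DFT helper, and all names below are assumptions chosen for illustration. It builds two different waveforms that share exactly the same magnitude spectrum, so a network fed only magnitudes could not tell them apart.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real sequence (O(n^2))."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def magnitude_spectrum(x):
    """Magnitude of each DFT bin; the phase is thrown away here."""
    return [abs(c) for c in dft(x)]


# Toy "waveform": a decaying oscillation (hypothetical data, not from the paper).
n = 16
wave = [math.exp(-0.3 * t) * math.sin(2 * math.pi * 3 * t / n) for t in range(n)]

# Circular time reversal: y[t] = x[(-t) mod n]. For a real signal this
# conjugates every DFT bin, so magnitudes are unchanged and only phases flip.
reversed_wave = [wave[(-t) % n] for t in range(n)]

mag_a = magnitude_spectrum(wave)
mag_b = magnitude_spectrum(reversed_wave)

# Identical magnitude spectra (up to floating-point error) ...
same_magnitude = all(abs(a - b) < 1e-9 for a, b in zip(mag_a, mag_b))
# ... even though the two waveforms clearly differ sample by sample.
max_waveform_gap = max(abs(a - b) for a, b in zip(wave, reversed_wave))

print("same magnitude spectrum:", same_magnitude)
print("largest sample difference:", round(max_waveform_gap, 3))
```

Because the two signals differ only in phase, any model that takes the magnitude spectrum as its sole input sees them as identical, which is the limitation that using the raw waveform as input is meant to avoid.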